Python is a dynamic, interpreted (bytecode-compiled) language. There are no type declarations of variables, parameters, functions, or methods in source code. This makes the code short and flexible, but you lose the compile-time type checking of the source code. Python tracks the types of all values at runtime and flags code that does not make sense as it runs.
An excellent way to see how Python code works is to type it into a notebook.
In [ ]:
a = 6 ## set a variable in this interpreter session
a ## entering an expression prints its value
In [ ]:
a + 2
In [ ]:
a = 'hi' ## 'a' can hold a string just as well
a
In [ ]:
len(a) ## call the len() function on a string
In [ ]:
a + len(a) ## try something that doesn't work
In [ ]:
a + str(len(a)) ## probably what you really wanted
In [ ]:
foo ## try something else that doesn't work
As you can see above, it's easy to experiment with variables and operators. Also, the interpreter throws, or "raises" in Python parlance, a runtime error if the code tries to read a variable that has not been assigned a value. Like C++ and Java, Python is case sensitive so "a" and "A" are different variables. The end of a line marks the end of a statement, so unlike C++ and Java, Python does not require a semicolon at the end of each statement. Comments begin with a '#' and extend to the end of the line.
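The two failures above can also be caught and inspected instead of stopping the cell. This is only a minimal sketch: the names `greeting`, `errors`, and `undefined_name` are made up for illustration, and the single-argument, parenthesized print is used so the snippet runs under both Python 2 and Python 3.

```python
greeting = 'hi'
errors = []

try:
    greeting + len(greeting)   # str + int: the types do not fit together
except TypeError as err:
    errors.append(type(err).__name__)

try:
    undefined_name             # hypothetical name that was never assigned
except NameError as err:
    errors.append(type(err).__name__)

print(errors)
```

Both errors are raised only when the offending line actually executes, which is what "flags code as it runs" means in practice.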
Python source files use the ".py" extension and are called "modules." With a Python module hello.py, the easiest way to run it is with the shell command "python hello.py Alice" which calls the Python interpreter to execute the code in hello.py, passing it the command line argument "Alice". See the official docs page on all the different options you have when running Python from the command-line.
Here's a very simple hello.py program (notice that blocks of code are delimited strictly using indentation rather than curly braces — more on this later!):
#!/usr/bin/env python
# import modules used here -- sys is a very standard one
import sys
# Gather our code in a main() function
def main():
    print 'Hello there', sys.argv[1]
    # Command line args are in sys.argv[1], sys.argv[2] ...
    # sys.argv[0] is the script name itself and can be ignored
# Standard boilerplate to call the main() function to begin
# the program.
if __name__ == '__main__':
    main()
Open a terminal window and paste this code into the file hello.py, then run the program a few times.
The outermost statements in a Python file, or "module", do its one-time setup: those statements run from top to bottom the first time the module is imported somewhere, setting up its variables and functions. A Python module can be run directly, as above with "python hello.py Bob", or it can be imported and used by some other module. When a Python file is run directly, the special variable __name__ is set to "__main__". Therefore, it's common to have the boilerplate if __name__ ==... shown above to call a main() function when the module is run directly, but not when the module is imported by some other module.
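One quick way to see the difference is to compare the `__name__` of an imported module with the `__name__` of the code you are running. This is a small sketch using the standard `os` module as the imported example; `imported_name` is an illustrative variable. What the second print shows depends on how the code is run: a script or notebook cell normally reports `__main__`.

```python
import os

# An imported module sees its own module name in __name__.
imported_name = os.__name__
print(imported_name)

# Code that is run directly sees '__main__' instead.
print(__name__)
```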
In a standard Python program, the list sys.argv contains the command-line arguments in the standard way, with sys.argv[0] being the program itself, sys.argv[1] the first argument, and so on. If you know about argc, the number of arguments in C, you can get the same value in Python with len(sys.argv), just as we did above when requesting the length of a string. In general, len() can tell you how long a string is, the number of elements in lists and tuples (another array-like data structure), and the number of key-value pairs in a dictionary.
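As a sketch of how len(sys.argv) might be used, the variant below avoids an IndexError when no argument is given. The helper `greeting()` and the default name 'World' are assumptions for illustration and are not part of the original hello.py. The `sizes` list shows len() on the other container types mentioned above.

```python
import sys

def greeting(argv):
    """Builds the greeting from an argv-style list."""
    # argv[0] is the script name, so real arguments start at index 1.
    if len(argv) > 1:
        name = argv[1]
    else:
        name = 'World'   # assumed default, not part of the original hello.py
    return 'Hello there ' + name

# len() works the same way on strings, lists, tuples, and dictionaries.
sizes = [len('hello'), len(['a', 'b', 'c']), len(('x', 'y')), len({'k': 1})]

if __name__ == '__main__':
    print(greeting(sys.argv))
    print(sizes)
```

Passing the list in as a parameter, rather than reading sys.argv inside the function, makes the function easy to try with hand-written lists such as ['hello.py', 'Alice'].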
In [ ]:
# Defines a "repeat" function that takes 2 arguments.
def repeat(s, exclaim):
    """
    Returns the string 's' repeated 3 times.
    If exclaim is true, add exclamation marks.
    """
    result = s + s + s # can also use "s * 3" which is faster (Why?)
    if exclaim:
        result = result + '!!!'
    return result
Notice how the lines that make up the function or if-statement are grouped by all having the same level of indentation. We also presented 2 different ways to repeat strings: the + operator, which is more user-friendly, and *, which is Python's "repeat" operator, so that '-' * 10 gives '----------', a neat way to create an onscreen "line." In the code comment, we hinted that * works faster than +. The reason is that * calculates the size of the resulting object once, whereas with + that calculation is made each time + is called, and each + also builds a new intermediate string. Both + and * are called "overloaded" operators because they mean different things for numbers vs. for strings (and other data types).
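The overloading is easy to check in a cell. This is a minimal sketch with made-up variable names (`s`, `same`, `line`, `total`, `product`):

```python
s = 'ab'
same = (s + s + s == s * 3)     # both spell 'ababab'
line = '-' * 10                 # ten dashes

# The same two operators mean ordinary arithmetic for numbers.
total = 2 + 3
product = 2 * 3

print(same)
print(line)
print(total)
print(product)
```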
The def keyword defines the function with its parameters within parentheses and its code indented. The first line of a function can be a documentation string ("docstring") that describes what the function does. The docstring can be a single line, or a multi-line description as in the example above. (Yes, those are "triple quotes," which let a string literal span several lines.) Variables defined in the function are local to that function, so the "result" in the above function is separate from a "result" variable in another function. The return statement can take an argument, in which case that is the value returned to the caller; without one, the function returns None.
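To illustrate docstrings and local variables, here is a small sketch with two hypothetical functions, `shout()` and `whisper()`. Each has its own `result`, and the docstring is available at runtime through the `__doc__` attribute.

```python
def shout(s):
    """Returns 's' in upper case followed by an exclamation mark."""
    result = s.upper() + '!'      # local to shout()
    return result

def whisper(s):
    """Returns 's' in lower case."""
    result = s.lower()            # a different, unrelated 'result'
    return result

print(shout('hi'))
print(whisper('HI'))
print(shout.__doc__)
```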
Here is code that calls the above repeat() function, printing what it returns:
In [ ]:
def happy(name):
    print repeat(' Yay ' + name, False)
    print repeat('Woo Hoo', True)
At run time, functions must be defined by the execution of a "def" before they are called.
In [ ]:
happy('Tobi')
One unusual Python feature is that the whitespace indentation of a piece of code affects its meaning. A logical block of statements such as the ones that make up a function should all have the same indentation, set in from the indentation of their parent function or "if" or whatever. If one of the lines in a group has a different indentation, it is flagged as a syntax error.
Python's use of whitespace feels a little strange at first, but it's logical and I found I got used to it very quickly. Avoid using TABs as they greatly complicate the indentation scheme (not to mention TABs may mean different things on different platforms). Set your editor to insert spaces instead of TABs for Python code.
A common question beginners ask is, "How many spaces should I indent?" According to the official Python style guide (PEP 8), you should indent with 4 spaces. A good IDE should take care of this for you.
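To see the indentation rule enforced without breaking a cell, you can hand source text to the built-in compile() function. This is a sketch only: the helper `check()` and the two sample strings are invented for illustration.

```python
def check(source):
    """Returns 'ok' if the source compiles, else the error's class name."""
    try:
        compile(source, '<example>', 'exec')
        return 'ok'
    except SyntaxError as err:   # IndentationError is a kind of SyntaxError
        return type(err).__name__

good_source = "def f():\n    x = 1\n    y = 2\n"
bad_source = "def f():\n    x = 1\n  y = 2\n"   # 'y' lines up with no enclosing block

print(check(good_source))
print(check(bad_source))
```

The second string is rejected before any of it runs, because the line with `y` matches neither the function body nor the top level.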
In [ ]:
def happy(name):
    if name == 'Tobi':
        print repeet(' Yay ' + name, False)
    else:
        print repeat('Woo Hoo', True)
The if-statement contains an obvious error, where the repeat() function is accidentally typed in as repeet(). The funny thing in Python ... this code compiles and runs fine so long as the name at runtime is not 'Tobi'. Only when a run actually tries to execute the repeet() will it notice that there is no such function and raise an error. This just means that when you first run a Python program, some of the first errors you see will be simple typos like this. This is one area where languages with a more verbose type system, like Java, have an advantage ... they can catch such errors at compile time (but of course you have to maintain all that type information ... it's a tradeoff).
In [ ]:
happy('Cody')
In [ ]:
happy('Tobi')
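The same behavior can be reproduced in a self-contained way. In this sketch, `cheer()` and the misspelled `cheeer_loudly()` are hypothetical names. The function is defined without complaint, and the typo only surfaces as a NameError when the broken branch runs.

```python
def cheer(name):
    if name == 'Tobi':
        return cheeer_loudly(name)   # deliberate typo: no such function exists
    return 'Woo Hoo ' + name

print(cheer('Cody'))                 # the broken branch is never reached

try:
    cheer('Tobi')
except NameError as err:
    caught = type(err).__name__
    print(caught)
```

Running every branch at least once, for example with a couple of quick calls like these, is a cheap way to flush out such typos early.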
Since Python variables don't have any type spelled out in the source code, it's extra helpful to give meaningful names to your variables to remind yourself of what's going on. So use "name" if it's a single name, and "names" if it's a list of names, and "tuples" if it's a list of tuples. Many basic Python errors result from forgetting what type of value is in each variable, so use your variable names (all you have really) to help keep things straight.
As far as actual naming goes, some languages prefer underscored_parts for variable names made up of "more than one word," but other languages prefer camelCasing. In general, Python prefers the underscore method but guides developers to defer to camelCasing if integrating into existing Python code that already uses that style. Readability counts. Read more in the section on naming conventions in PEP 8.
As you can guess, keywords like 'print' and 'while' cannot be used as variable names; you'll get a syntax error if you do. Built-ins are a different matter: they are not keywords, so Python will let you reuse their names, and new Python developers often do so inadvertently. For example, while 'str' and 'list' may seem like good variable names, assigning to them hides the built-in str and list for the rest of that scope.
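The standard `keyword` module can tell the two groups apart. The sketch below also shows what goes wrong after shadowing a built-in. The function `broken_length()` is a made-up example, and the shadowing is kept inside it so the real len() stays usable elsewhere.

```python
import keyword

# Keywords are reserved by the grammar; built-ins are ordinary names.
print(keyword.iskeyword('while'))
print(keyword.iskeyword('str'))

def broken_length(items):
    len = 3                  # shadows the built-in len() inside this function
    try:
        return len(items)    # now tries to call the integer 3
    except TypeError as err:
        return type(err).__name__

print(broken_length(['a', 'b']))
```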
Suppose you've got a module "binky.py" which contains a "def foo()". The fully qualified name of that foo function is "binky.foo". In this way, various Python modules can name their functions and variables whatever they want, and the variable names won't conflict — module1.foo is different from module2.foo. In the Python vocabulary, we'd say that binky, module1, and module2 each have their own "namespaces," which as you can guess are variable name-to-object bindings.
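The standard library has real examples of this. As a sketch, both the `math` and `operator` modules define a function called `pow`, and the module prefix keeps them apart. The variable names `float_result` and `int_result` are illustrative.

```python
import math
import operator

# Two different functions that share the short name 'pow'.
float_result = math.pow(2, 3)       # math's version always returns a float
int_result = operator.pow(2, 3)     # operator's version behaves like 2 ** 3

print(float_result)
print(int_result)
print(math.pow is operator.pow)
```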
For example, we have the standard "sys" module that contains some standard system facilities, like the argv list and the exit() function. The statement "import sys" gives you access to the definitions in the sys module and makes them available by their fully-qualified names, e.g. sys.exit(). (Yes, 'sys' has a namespace too!)
In [ ]:
import sys
In [ ]:
# Now can refer to sys.xxx facilities
sys.maxint
There is another import form that looks like this: "from sys import argv, exit". That makes argv and exit() available by their short names; however, we recommend the original form with the fully-qualified names because it's a lot easier to determine where a function or attribute came from.
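The two import forms give access to the same objects; only the name you type differs. A minimal check:

```python
import sys
from sys import argv

# Both names refer to the very same list object.
print(argv is sys.argv)
```

With the short form, a reader who sees `argv` further down the file has to look back at the imports to learn where it came from, which is the readability cost mentioned above.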
There are many modules and packages which are bundled with a standard installation of the Python interpreter, so you don't have to do anything extra to use them. These are collectively known as the "Python Standard Library." Commonly used modules/packages include:
You can find the documentation of all the Standard Library modules and packages at http://docs.python.org/library.
Inside the Python interpreter, the help() function pulls up documentation strings for various modules, functions, and methods. These doc strings are similar to Java's javadoc. The dir() function tells you what the attributes of an object are. Below are some ways to call help() and dir() from the interpreter:
In [ ]:
help(len)
In Jupyter notebook you can also find help on a function by pressing shift-TAB after a function's first parenthesis.
In [ ]:
len( <PRESS SHIFT-TAB BEFORE THIS>
Pressing shift-TAB twice in succession makes the documentation box bigger (showing more of the docs).
Recall our string "a".
In [ ]:
a
In [ ]:
dir(a)
The above output is annoyingly long - click in the margin under the Out[] to make it smaller.
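Another way to tame the output is to filter it. The sketch below drops the names that start with an underscore and reads one docstring directly through `__doc__`, which is the text help() displays. The names `text`, `public_names`, and `first_line` are illustrative.

```python
text = 'hi'

# Keep only the names that do not start with an underscore.
public_names = [name for name in dir(text) if not name.startswith('_')]
print(public_names[:5])

# help() prints the docstring; the raw text is also available as __doc__.
first_line = text.upper.__doc__.strip().splitlines()[0]
print(first_line)
```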
Note: This notebook is an adaptation of Google's Python tutorial https://developers.google.com/edu/python